The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2008
Abstract
This study presents the TÜBİTAK-UEKAE statistical machine translation system, which is based on Moses, the open-source phrase-based statistical machine translation software. Additionally, phrase-table augmentation is applied to maximize source language coverage; lexical approximation is applied to replace out-of-vocabulary words with known words prior to decoding; and automatic punctuation insertion is improved. We describe the preprocessing and postprocessing steps and our training and decoding procedures.
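The abstract does not say how lexical approximation selects a replacement word. As a purely illustrative sketch, assuming plain string similarity is the criterion, the idea of swapping each out-of-vocabulary token for its closest known word before decoding might look like the following. The function name `approximate_oov`, the `cutoff` threshold, and the toy vocabulary are hypothetical and are not taken from the paper.

```python
from difflib import get_close_matches


def approximate_oov(tokens, vocabulary, cutoff=0.6):
    """Replace each out-of-vocabulary token with its most similar known word.

    Assumption: similarity is difflib's character-based ratio; the system
    described in the abstract may use a different (e.g. morphological) criterion.
    """
    vocab_list = sorted(vocabulary)  # stable order for reproducible matches
    result = []
    for token in tokens:
        if token in vocabulary:
            # Known word: pass through unchanged.
            result.append(token)
            continue
        # Unknown word: take the single best match above the cutoff, if any.
        matches = get_close_matches(token, vocab_list, n=1, cutoff=cutoff)
        result.append(matches[0] if matches else token)
    return result


if __name__ == "__main__":
    vocab = {"book", "books", "read", "the"}
    print(approximate_oov(["the", "bookz", "xyzzy"], vocab))
```

Tokens with no sufficiently similar known word are left unchanged, so the decoder still sees them as out-of-vocabulary items.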
Similar resources
The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2007
We describe the TÜBİTAK-UEKAE system that participated in the Arabic-to-English and Japanese-to-English translation tasks of the IWSLT 2007 evaluation campaign. Our system is built on the open-source phrase-based statistical machine translation software Moses. Among available corpora and linguistic resources, only the supplied training data and an Arabic morphological analyzer are used in the sys...
The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2009
We describe our Arabic-to-English and Turkish-to-English machine translation systems that participated in the IWSLT 2009 evaluation campaign. Both systems are based on the Moses statistical machine translation toolkit, with added components to address the rich morphology of the source languages. Three different morphological approaches are investigated for Turkish. Our primary submission uses l...
The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2010
We report on our participation in the IWSLT 2010 evaluation campaign. Similar to previous years, our submitted systems are based on the Moses statistical machine translation toolkit. This year, we also experimented with hierarchical phrase-based models. In addition, we utilized automatic minimum error-rate training instead of manually-guided tuning. We focused more on the BTEC Turkish-English ta...
The TÜBİTAK statistical machine translation system for IWSLT 2012
We describe the TÜBİTAK submission to the IWSLT 2012 Evaluation Campaign. Our system development focused on utilizing Bayesian alignment methods such as variational Bayes and Gibbs sampling in addition to the standard GIZA++ alignments. The submitted tracks are the Arabic-English and Turkish-English TED Talks translation tasks.
The LIUM Arabic/English statistical machine translation system for IWSLT 2008
This paper describes the system developed by the LIUM laboratory for the 2008 IWSLT evaluation. We only participated in the Arabic/English BTEC task. We developed a statistical phrase-based system using the Moses toolkit and SYSTRAN’s rule-based translation system to perform a morphological decomposition of the Arabic words. A continuous space language model was deployed to improve the modeling...